Multi-objective optimization via equivariant deep hypervolume approximation
Optimizing multiple competing objectives is a common problem across science
and industry. The inherent trade-offs between these objectives lead to the
task of exploring their Pareto front. A meaningful quantity for this task is
the hypervolume indicator, which is used in
Bayesian Optimization (BO) and Evolutionary Algorithms (EAs). However, the
computational cost of calculating the hypervolume scales unfavorably with the
number of objectives and data points, which restricts its use in these common
multi-objective optimization frameworks. To
overcome these restrictions we propose to approximate the hypervolume function
with a deep neural network, which we call DeepHV. For better sample efficiency
and generalization, we exploit the fact that the hypervolume is
scale-equivariant in each of the objectives as well as permutation invariant
w.r.t. both the objectives and the samples, by using a deep neural network that
is equivariant w.r.t. the combined group of scalings and permutations. We
evaluate our method against exact and approximate hypervolume methods in terms
of accuracy, computation time, and generalization. We also apply and compare
our methods to state-of-the-art multi-objective BO methods and EAs on a range
of synthetic benchmark test cases. The results show that our methods are
promising for such multi-objective optimization tasks.
Comment: Updated with camera-ready version. Accepted at ICLR 202
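The hypervolume indicator discussed above is computable exactly in low dimensions. The minimal two-objective sketch below (minimization, with respect to a reference point) illustrates the quantity DeepHV approximates and the symmetries it exploits; the function name and toy points are illustrative, not from the paper.

```python
import numpy as np

def hypervolume_2d(points, ref):
    """Hypervolume dominated by a set of 2-D points (minimization) w.r.t. ref.

    Decomposes the dominated region into vertical slabs after sorting by the
    first objective; dominated (non-Pareto) points contribute nothing.
    """
    pts = np.asarray(points, dtype=float)
    pts = pts[np.all(pts < ref, axis=1)]   # keep only points that dominate ref
    if len(pts) == 0:
        return 0.0
    pts = pts[np.argsort(pts[:, 0])]       # sort by first objective
    hv, prev_y1 = 0.0, ref[1]
    for y0, y1 in pts:
        if y1 < prev_y1:                   # point lies on the Pareto front
            hv += (ref[0] - y0) * (prev_y1 - y1)
            prev_y1 = y1
    return hv
```

Scaling one objective of every point (and of the reference) by a factor c > 0 scales the result by c, and shuffling the rows leaves it unchanged; these are exactly the scale-equivariance and permutation-invariance properties the abstract builds into the network.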
Predicting RP-LC retention indices of structurally unknown chemicals from mass spectrometry data
Non-target analysis combined with high-resolution mass spectrometry is considered one of the most comprehensive strategies for the detection and identification of known and unknown chemicals in complex samples. However, many compounds remain unidentified due to data complexity and the limited number of structures in chemical databases. In this work, we have developed and validated a novel machine learning algorithm to predict the retention index (RI) values of structurally (un)known chemicals based on their measured fragmentation pattern. The developed model, for the first time, enabled the prediction of RI values without the need for the exact structure of the chemicals, with an R^2 of 0.91 and 0.77 and a root mean squared error (RMSE) of 47 and 67 RI units for the NORMAN (n=3131) and amide (n=604) test sets, respectively. This fragment-based model showed accuracy in RI prediction comparable to that of conventional descriptor-based models, which rely on the known chemical structure and obtained an R^2 of 0.85 with an RMSE of 67.
Chemometric Strategies for Fully Automated Interpretive Method Development in Liquid Chromatography
The great potential gains in separation power and analysis time that can result from rigorously optimizing LC-MS and 2D-LC-MS methods for routine measurements have prompted many scientists to develop computer-aided method-development tools. The applicability of these tools has been proven in numerous applications, but their proliferation is still limited. Arguably, the majority of LC methods are still developed in a conventional manner, i.e., by analysts who rely on their knowledge and experience. In this work, a novel, open-source algorithm was developed for automated and interpretive method development of LC separations. A closed-loop workflow was constructed that interacted directly with the LC and ran unsupervised in an automated fashion. The algorithm was tested using two newly designed strategies. The first utilized retention modeling, whereas the second used a Bayesian-optimization machine-learning approach. In both cases, the algorithm could arrive within ten iterations at an optimum of the objective function, which included resolution and measurement time. The design of the algorithm was modular, so as to facilitate compatibility with previous work in the literature; its performance thus hinged on each module (e.g., signal processing, choice of retention model, objective function). Key focus areas for further improvement were identified. Bayesian optimization did not require any peak tracking or retention modeling. Accurate prediction of elution profiles was found to be indispensable for the strategy using retention modeling. This is the first interpretive algorithm demonstrated with complex samples. Peak tracking was conducted using UV-Vis absorbance detection, but use of MS detection is expected to significantly broaden the applicability of the workflow.
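To make the closed-loop idea concrete, here is a hypothetical sketch of the Bayesian-optimization strategy: a Gaussian-process surrogate with an upper-confidence-bound acquisition proposes the next value of a single method parameter (a stand-in for a gradient slope), a simulated "instrument" returns an objective trading off resolution against run time, and the loop repeats. The response surface, kernel, length scale, and acquisition rule are all illustrative choices, not the paper's implementation.

```python
import numpy as np

def run_lc(slope):
    """Stand-in for an actual LC run: resolution peaks near slope = 0.6,
    and shallow gradients incur a time penalty (made-up response surface)."""
    resolution = np.exp(-(slope - 0.6) ** 2 / 0.05)
    time_penalty = 0.3 * (1.0 - slope)
    return resolution - time_penalty

def rbf(a, b, ls=0.15):
    """Squared-exponential kernel between two 1-D arrays of inputs."""
    return np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2 / ls ** 2)

grid = np.linspace(0.0, 1.0, 201)                  # candidate parameter values
X, y = [0.1, 0.9], [run_lc(0.1), run_lc(0.9)]      # two seed experiments

for _ in range(8):                                 # closed-loop iterations
    Xa, ya = np.array(X), np.array(y)
    K = rbf(Xa, Xa) + 1e-6 * np.eye(len(Xa))       # jitter for stability
    Ks = rbf(grid, Xa)
    mu = Ks @ np.linalg.solve(K, ya)               # GP posterior mean
    var = 1.0 - np.sum(Ks * np.linalg.solve(K, Ks.T).T, axis=1)
    ucb = mu + 2.0 * np.sqrt(np.clip(var, 0.0, None))   # acquisition
    x_next = grid[np.argmax(ucb)]                  # propose next experiment
    X.append(x_next)
    y.append(run_lc(x_next))                       # "run" it and record

best = X[int(np.argmax(y))]
```

Each pass through the loop plays the role of one unsupervised instrument cycle: propose a method, run it, score the chromatogram, and update the surrogate.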
Chemometric Strategies for Fully Automated Interpretive Method Development in Liquid Chromatography
The majority of liquid chromatography (LC) methods are still developed in a conventional manner, that is, by analysts who rely on their knowledge and experience to make method development decisions. In this work, a novel, open-source algorithm was developed for automated and interpretive method development of LC(-mass spectrometry) separations ("AutoLC"). A closed-loop workflow was constructed that interacted directly with the LC system and ran unsupervised in an automated fashion. To achieve this, several challenges related to peak tracking, retention modeling, the automated design of candidate gradient profiles, and the simulation of chromatograms were investigated. The algorithm was tested using two newly designed method development strategies. The first utilized retention modeling, whereas the second used a Bayesian-optimization machine learning approach. In both cases, the algorithm could arrive within 4-10 iterations (i.e., sets of method parameters) at an optimum of the objective function, which included resolution and analysis time as measures of performance. Retention modeling was found to be more efficient while depending on peak tracking, whereas Bayesian optimization was more flexible but limited in scalability. We have deliberately designed the algorithm to be modular to facilitate compatibility with previous and future work (e.g., previously published data handling algorithms).
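The retention-modeling strategy requires a model linking retention to mobile-phase composition. A common choice in reversed-phase LC is the linear solvent strength (LSS) relation ln k = ln k_w - S*phi; the abstract does not state which retention model was used, so the sketch below, fitted to two made-up isocratic scouting runs, is an illustrative stand-in.

```python
import numpy as np

def fit_lss(phi, k):
    """Fit ln k = ln k_w - S*phi from two (or more) isocratic measurements
    by linear least squares; returns (ln k_w, S)."""
    A = np.vstack([np.ones_like(phi), -phi]).T
    (ln_kw, S), *_ = np.linalg.lstsq(A, np.log(k), rcond=None)
    return ln_kw, S

# Made-up scouting data: retention factors at two modifier fractions phi.
phi_scout = np.array([0.3, 0.5])
k_scout = np.array([12.0, 2.5])
ln_kw, S = fit_lss(phi_scout, k_scout)

def predict_k(phi):
    """Predict the retention factor at an arbitrary composition phi."""
    return np.exp(ln_kw - S * phi)
```

With such a model fitted per analyte, candidate gradient profiles can be simulated and scored without running them, which is what makes the retention-modeling strategy sample-efficient (and why it hinges on accurate peak tracking).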